Overview

Dataset statistics

Number of variables: 27
Number of observations: 10302
Missing cells: 3004
Missing cells (%): 1.1%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 12.3 MiB
Average record size in memory: 1.2 KiB
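
The percentage and per-record figures above follow from the raw counts. The sketch below recomputes them as a sanity check; it assumes that "Missing cells (%)" is missing cells divided by variables times observations, and that 1 MiB equals 1024 KiB. The variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 27
n_observations = 10302
missing_cells = 3004
total_size_mib = 12.3

# Assumption: the percentage is taken over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 KiB.
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_kib:.1f} KiB")
```

Rounded to one decimal place, both derived values agree with the reported 1.1% and 1.2 KiB.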

Variable types

CAT: 13
NUM: 10
BOOL: 4

Reproduction

Analysis started: 2020-07-13 16:08:14.611306
Analysis finished: 2020-07-13 16:09:55.127399
Duration: 1 minute and 40.52 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
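
The reported duration can be recovered from the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block of the report.
started = datetime.fromisoformat("2020-07-13 16:08:14.611306")
finished = datetime.fromisoformat("2020-07-13 16:09:55.127399")

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```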

Warnings

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
BIRTH has a high cardinality: 6560 distinct values [High cardinality]
INCOME has a high cardinality: 8151 distinct values [High cardinality]
HOME_VAL has a high cardinality: 6334 distinct values [High cardinality]
BLUEBOOK has a high cardinality: 2985 distinct values [High cardinality]
OLDCLAIM has a high cardinality: 3545 distinct values [High cardinality]
CLM_AMT has a high cardinality: 2346 distinct values [High cardinality]
YOJ has 548 (5.3%) missing values [Missing]
INCOME has 570 (5.5%) missing values [Missing]
HOME_VAL has 575 (5.6%) missing values [Missing]
OCCUPATION has 665 (6.5%) missing values [Missing]
CAR_AGE has 639 (6.2%) missing values [Missing]
BIRTH is uniformly distributed [Uniform]
KIDSDRIV has 9069 (88.0%) zeros [Zeros]
HOMEKIDS has 6694 (65.0%) zeros [Zeros]
YOJ has 807 (7.8%) zeros [Zeros]
CLM_FREQ has 6292 (61.1%) zeros [Zeros]
MVR_PTS has 4658 (45.2%) zeros [Zeros]
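
Several of the high-cardinality warnings (INCOME, HOME_VAL, BLUEBOOK) concern dollar amounts stored as text such as "$61,790 ", which is why the report treats them as categorical. The sketch below shows one possible way to turn such strings into integers. The helper name `parse_dollars` is hypothetical, and the handling of "nan" and of trailing whitespace is an assumption based on the character counts shown later in the report.

```python
from typing import Optional


def parse_dollars(raw: Optional[str]) -> Optional[int]:
    """Convert a string such as '$61,790 ' to the integer 61790.

    Returns None for missing values (None, empty string, or 'nan').
    """
    if raw is None:
        return None
    # Drop surrounding whitespace, the currency symbol and thousands separators.
    cleaned = raw.strip().replace("$", "").replace(",", "")
    if cleaned == "" or cleaned.lower() == "nan":
        return None
    return int(cleaned)


samples = ["$0 ", "$61,790 ", "$144,348 ", "nan", None]
print([parse_dollars(s) for s in samples])
```

Once parsed this way, these columns could be profiled as numeric variables rather than as high-cardinality categories.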

Variables

ID
Real number (ℝ≥0)

Distinct count: 8753
Unique (%): 85.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 495663109.08318776
Minimum: 63175
Maximum: 999926368
Zeros: 0
Zeros (%): 0.0%
Memory size: 80.6 KiB

Quantile statistics

Minimum: 63175
5-th percentile: 50696155.55
Q1: 244286856
Median: 497004293
Q3: 739455069
95-th percentile: 944365215.5
Maximum: 999926368
Range: 999863193
Interquartile range (IQR): 495168213

Descriptive statistics

Standard deviation: 286467479
Coefficient of variation (CV): 0.5779479525
Kurtosis: -1.192724113
Mean: 495663109.1
Median Absolute Deviation (MAD): 246430252
Skewness: 0.005051427651
Sum: 5.10632135e+12
Variance: 8.206361654e+16
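
Several of the ID statistics are derived from others: the range and IQR are differences of quantiles, the CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below recomputes them from the reported values. It is only a consistency check; small discrepancies are expected because the report rounds its figures.

```python
import math

# Values copied from the ID variable's statistics in the report.
n_observations = 10302
minimum, maximum = 63175, 999926368
q1, q3 = 244286856, 739455069
mean = 495663109.08318776
std = 286467479  # shown in the report without decimals

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n_observations     # Sum

print(value_range, iqr)
print(math.isclose(cv, 0.5779479525, rel_tol=1e-6))
print(math.isclose(variance, 8.206361654e16, rel_tol=1e-6))
print(math.isclose(total, 5.10632135e12, rel_tol=1e-6))
```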
Histogram with fixed size bins (bins=10)
Common values

Value                 Count   Frequency (%)
341162899             5       < 0.1%
747557690             5       < 0.1%
632067262             5       < 0.1%
173124759             5       < 0.1%
627828331             4       < 0.1%
656344335             4       < 0.1%
195066900             4       < 0.1%
750731752             4       < 0.1%
46413377              4       < 0.1%
779624178             4       < 0.1%
59026158              4       < 0.1%
303183248             4       < 0.1%
634343179             4       < 0.1%
22340563              4       < 0.1%
430554774             4       < 0.1%
846904796             4       < 0.1%
983800811             4       < 0.1%
792036483             4       < 0.1%
799857899             4       < 0.1%
528698470             4       < 0.1%
291961707             4       < 0.1%
55524088              4       < 0.1%
457987162             4       < 0.1%
931221104             4       < 0.1%
132609655             4       < 0.1%
Other values (8728)   10198   99.0%

Minimum values

Value                 Count   Frequency (%)
63175                 1       < 0.1%
246910                1       < 0.1%
401276                1       < 0.1%
813128                2       < 0.1%
1307371               2       < 0.1%
1514697               1       < 0.1%
1541149               1       < 0.1%
1627973               1       < 0.1%
1780186               1       < 0.1%
1860885               1       < 0.1%

Maximum values

Value                 Count   Frequency (%)
999926368             2       < 0.1%
999800537             1       < 0.1%
999640290             1       < 0.1%
999577084             1       < 0.1%
999482663             1       < 0.1%
999457398             1       < 0.1%
999331839             1       < 0.1%
999178959             1       < 0.1%
999169190             1       < 0.1%
999158340             1       < 0.1%
 

KIDSDRIV
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.16928751698699282
Minimum: 0
Maximum: 4
Zeros: 9069
Zeros (%): 88.0%
Memory size: 80.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 4
Range: 4
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5065115815
Coefficient of variation (CV): 2.992019675
Kurtosis: 11.67651844
Mean: 0.169287517
Median Absolute Deviation (MAD): 0
Skewness: 3.342867542
Sum: 1744
Variance: 0.2565539822
Histogram with fixed size bins (bins=10)
Common values

Value   Count   Frequency (%)
0       9069    88.0%
1       804     7.8%
2       351     3.4%
3       74      0.7%
4       4       < 0.1%

Minimum values

Value   Count   Frequency (%)
0       9069    88.0%
1       804     7.8%
2       351     3.4%
3       74      0.7%
4       4       < 0.1%

Maximum values

Value   Count   Frequency (%)
4       4       < 0.1%
3       74      0.7%
2       351     3.4%
1       804     7.8%
0       9069    88.0%
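
Because KIDSDRIV takes only five distinct values, its summary statistics can be rebuilt from the frequency table alone. The sketch below does this for the sum, mean, variance and share of zeros. It assumes the sample variance (dividing by n - 1); that choice is an assumption, although it does reproduce the reported figure.

```python
import math

# Frequency table for KIDSDRIV as shown in the report: value -> count.
counts = {0: 9069, 1: 804, 2: 351, 3: 74, 4: 4}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Assumption: sample variance, i.e. squared deviations divided by n - 1.
sum_sq_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = sum_sq_dev / (n - 1)

zeros_pct = 100 * counts[0] / n

print(n, total)
print(math.isclose(mean, 0.16928751698699282, rel_tol=1e-9))
print(math.isclose(variance, 0.2565539822, rel_tol=1e-6))
print(f"{zeros_pct:.1f}%")
```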
 

BIRTH
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count: 6560
Unique (%): 63.7%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB
Value                 Count   Frequency (%)
20-Oct-60             8       0.1%
10-Oct-53             7       0.1%
13-Sep-51             6       0.1%
16-Jul-59             6       0.1%
7-Feb-48              6       0.1%
14-Feb-55             6       0.1%
2-Feb-51              6       0.1%
31-Jul-54             6       0.1%
14-Dec-65             6       0.1%
11-Jan-60             6       0.1%
1-Oct-44              6       0.1%
6-Feb-54              6       0.1%
17-Jul-46             6       0.1%
25-Mar-54             6       0.1%
23-Aug-60             6       0.1%
14-May-47             6       0.1%
6-Nov-49              6       0.1%
18-Feb-60             6       0.1%
24-Oct-60             5       < 0.1%
27-Aug-53             5       < 0.1%
13-May-51             5       < 0.1%
20-Nov-51             5       < 0.1%
30-Nov-43             5       < 0.1%
2-Jun-52              5       < 0.1%
22-Dec-53             5       < 0.1%
Other values (6535)   10156   98.6%
 

Length

Max length: 9
Median length: 9
Mean length: 8.706950107
Min length: 8
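
BIRTH is profiled as a categorical variable because its values are date strings such as "20-Oct-60", with a two-digit year and a day that may have one or two digits (hence lengths of 8 or 9). The sketch below shows one possible way to parse them. The helper name `parse_birth` is hypothetical, and the choice of the analysis date as the cut-off for the century is an assumption. Python's `%y` maps 00 to 68 onto 2000 to 2068, so dates that land in the future are shifted back by 100 years.

```python
from datetime import date, datetime


def parse_birth(raw: str, reference: date = date(2020, 7, 13)) -> date:
    """Parse a string such as '20-Oct-60' into a date.

    Assumption: a birth date cannot lie after `reference` (here the date
    of the analysis), so future dates are moved back one century.
    Note: %b depends on the locale; English month names are assumed.
    """
    parsed = datetime.strptime(raw.strip(), "%d-%b-%y").date()
    if parsed > reference:
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed


print(parse_birth("20-Oct-60"))
print(parse_birth("7-Feb-48"))
```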

Overview of Unicode Properties

Unique unicode characters: 33
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
-2060423.0%
 
563037.0%
 
156736.3%
 
254156.0%
 
447255.3%
 
644445.0%
 
331383.5%
 
u26362.9%
 
J26282.9%
 
a26032.9%
 
e25572.9%
 
724472.7%
 
820382.3%
 
920122.2%
 
019942.2%
 
c17922.0%
 
M17502.0%
 
r17271.9%
 
n16941.9%
 
A16671.9%
 
p16411.8%
 
l9341.0%
 
D9081.0%
 
O8841.0%
 
t8841.0%
 
Other values (8)66017.4%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number3818942.6%
 
Dash Punctuation2060423.0%
 
Lowercase Letter2060423.0%
 
Uppercase Letter1030211.5%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
5630316.5%
 
1567314.9%
 
2541514.2%
 
4472512.4%
 
6444411.6%
 
331388.2%
 
724476.4%
 
820385.3%
 
920125.3%
 
019945.2%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-20604100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
J262825.5%
 
M175017.0%
 
A166716.2%
 
D9088.8%
 
O8848.6%
 
S8358.1%
 
N8167.9%
 
F8147.9%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
u263612.8%
 
a260312.6%
 
e255712.4%
 
c17928.7%
 
r17278.4%
 
n16948.2%
 
p16418.0%
 
l9344.5%
 
t8844.3%
 
g8614.2%
 
y8294.0%
 
o8164.0%
 
v8164.0%
 
b8144.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common5879365.5%
 
Latin3090634.5%
 

Most frequent Common characters

ValueCountFrequency (%) 
-2060435.0%
 
5630310.7%
 
156739.6%
 
254159.2%
 
447258.0%
 
644447.6%
 
331385.3%
 
724474.2%
 
820383.5%
 
920123.4%
 
019943.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
u26368.5%
 
J26288.5%
 
a26038.4%
 
e25578.3%
 
c17925.8%
 
M17505.7%
 
r17275.6%
 
n16945.5%
 
A16675.4%
 
p16415.3%
 
l9343.0%
 
D9082.9%
 
O8842.9%
 
t8842.9%
 
g8612.8%
 
S8352.7%
 
y8292.7%
 
N8162.6%
 
o8162.6%
 
v8162.6%
 
F8142.6%
 
b8142.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII89699100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
-2060423.0%
 
563037.0%
 
156736.3%
 
254156.0%
 
447255.3%
 
644445.0%
 
331383.5%
 
u26362.9%
 
J26282.9%
 
a26032.9%
 
e25572.9%
 
724472.7%
 
820382.3%
 
920122.2%
 
019942.2%
 
c17922.0%
 
M17502.0%
 
r17271.9%
 
n16941.9%
 
A16671.9%
 
p16411.8%
 
l9341.0%
 
D9081.0%
 
O8841.0%
 
t8841.0%
 
Other values (8)66017.4%
 

AGE
Real number (ℝ≥0)

Distinct count61
Unique (%)0.6%
Missing7
Missing (%)0.1%
Infinite0
Infinite (%)0.0%
Mean44.83739679456047
Minimum16.0
Maximum81.0
Zeros0
Zeros (%)0.0%
Memory size80.6 KiB

Quantile statistics

Minimum16
5-th percentile30
Q139
median45
Q351
95-th percentile59
Maximum81
Range65
Interquartile range (IQR)12

Descriptive statistics

Standard deviation8.60644502
Coefficient of variation (CV)0.1919479193
Kurtosis-0.08090259645
Mean44.83739679
Median Absolute Deviation (MAD)6
Skewness-0.03454065479
Sum461601
Variance74.07089588
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
464964.8%
 
454884.7%
 
484644.5%
 
474514.4%
 
434414.3%
 
414294.2%
 
504244.1%
 
444234.1%
 
404063.9%
 
424043.9%
 
393933.8%
 
493923.8%
 
513863.7%
 
383523.4%
 
533403.3%
 
523223.1%
 
373063.0%
 
363022.9%
 
542712.6%
 
552642.6%
 
352452.4%
 
572152.1%
 
332142.1%
 
562072.0%
 
341961.9%
 
Other values (36)146414.2%
 
ValueCountFrequency (%) 
165< 0.1%
 
172< 0.1%
 
183< 0.1%
 
1980.1%
 
204< 0.1%
 
21120.1%
 
22170.2%
 
23120.1%
 
24250.2%
 
25320.3%
 
ValueCountFrequency (%) 
811< 0.1%
 
801< 0.1%
 
761< 0.1%
 
734< 0.1%
 
724< 0.1%
 
711< 0.1%
 
7060.1%
 
695< 0.1%
 
6880.1%
 
67160.2%
 

HOMEKIDS
Real number (ℝ≥0)

ZEROS

Distinct count6
Unique (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.720442632498544
Minimum0
Maximum5
Zeros6694
Zeros (%)65.0%
Memory size80.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile3
Maximum5
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.116323221
Coefficient of variation (CV)1.549496339
Kurtosis0.6293463987
Mean0.7204426325
Median Absolute Deviation (MAD)0
Skewness1.336677637
Sum7422
Variance1.246177534
Histogram with fixed size bins (bins=10)
Common values

Value   Count   Frequency (%)
0       6694    65.0%
2       1427    13.9%
1       1106    10.7%
3       856     8.3%
4       201     2.0%
5       18      0.2%

Minimum values

Value   Count   Frequency (%)
0       6694    65.0%
1       1106    10.7%
2       1427    13.9%
3       856     8.3%
4       201     2.0%
5       18      0.2%

Maximum values

Value   Count   Frequency (%)
5       18      0.2%
4       201     2.0%
3       856     8.3%
2       1427    13.9%
1       1106    10.7%
0       6694    65.0%
 

YOJ
Real number (ℝ≥0)

MISSING
ZEROS

Distinct count21
Unique (%)0.2%
Missing548
Missing (%)5.3%
Infinite0
Infinite (%)0.0%
Mean10.474061923313512
Minimum0.0
Maximum23.0
Zeros807
Zeros (%)7.8%
Memory size80.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q19
median11
Q313
95-th percentile15
Maximum23
Range23
Interquartile range (IQR)4

Descriptive statistics

Standard deviation4.108943185
Coefficient of variation (CV)0.3922970109
Kurtosis1.144802147
Mean10.47406192
Median Absolute Deviation (MAD)2
Skewness-1.200822881
Sum102164
Variance16.88341409
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
12150014.6%
 
11126712.3%
 
13126612.3%
 
149969.7%
 
109349.1%
 
08077.8%
 
96536.3%
 
155835.7%
 
84844.7%
 
73843.7%
 
162432.4%
 
62192.1%
 
171271.2%
 
51241.2%
 
4490.5%
 
3380.4%
 
18330.3%
 
2210.2%
 
19170.2%
 
170.1%
 
232< 0.1%
 
(Missing)5485.3%
 
ValueCountFrequency (%) 
08077.8%
 
170.1%
 
2210.2%
 
3380.4%
 
4490.5%
 
51241.2%
 
62192.1%
 
73843.7%
 
84844.7%
 
96536.3%
 
ValueCountFrequency (%) 
232< 0.1%
 
19170.2%
 
18330.3%
 
171271.2%
 
162432.4%
 
155835.7%
 
149969.7%
 
13126612.3%
 
12150014.6%
 
11126712.3%
 

INCOME
Categorical

HIGH CARDINALITY
MISSING

Distinct count8151
Unique (%)83.8%
Missing570
Missing (%)5.5%
Memory size80.6 KiB
ValueCountFrequency (%) 
$0 7977.7%
 
$61,790 5< 0.1%
 
$30,111 4< 0.1%
 
$64,916 4< 0.1%
 
$48,509 4< 0.1%
 
$43,393 4< 0.1%
 
$26,840 4< 0.1%
 
$54,691 3< 0.1%
 
$35,362 3< 0.1%
 
$34,658 3< 0.1%
 
$144,348 3< 0.1%
 
$67,700 3< 0.1%
 
$33,303 3< 0.1%
 
$48,741 3< 0.1%
 
$50,166 3< 0.1%
 
$107,375 3< 0.1%
 
$50,401 3< 0.1%
 
$82,398 3< 0.1%
 
$64,032 3< 0.1%
 
$142,297 3< 0.1%
 
$35,263 3< 0.1%
 
$63,357 3< 0.1%
 
$62,894 3< 0.1%
 
$2,346 3< 0.1%
 
$40,990 3< 0.1%
 
Other values (8126)885686.0%
 
(Missing)5705.5%
 

Length

Max length9
Median length8
Mean length7.454086585
Min length3

Overview of Unicode Properties

Unique unicode characters15
Unique unicode categories (?)5
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
$973212.7%
 
973212.7%
 
,888411.6%
 
160167.8%
 
249146.4%
 
348516.3%
 
047996.2%
 
446426.0%
 
545976.0%
 
645315.9%
 
742525.5%
 
940735.3%
 
840595.3%
 
n11401.5%
 
a5700.7%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number4673460.9%
 
Currency Symbol973212.7%
 
Space Separator973212.7%
 
Other Punctuation888411.6%
 
Lowercase Letter17102.2%
 

Most frequent Currency Symbol characters

ValueCountFrequency (%) 
$9732100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
1601612.9%
 
2491410.5%
 
3485110.4%
 
0479910.3%
 
446429.9%
 
545979.8%
 
645319.7%
 
742529.1%
 
940738.7%
 
840598.7%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
,8884100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
9732100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n114066.7%
 
a57033.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common7508297.8%
 
Latin17102.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
$973213.0%
 
973213.0%
 
,888411.8%
 
160168.0%
 
249146.5%
 
348516.5%
 
047996.4%
 
446426.2%
 
545976.1%
 
645316.0%
 
742525.7%
 
940735.4%
 
840595.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n114066.7%
 
a57033.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII76792100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
$973212.7%
 
973212.7%
 
,888411.6%
 
160167.8%
 
249146.4%
 
348516.3%
 
047996.2%
 
446426.0%
 
545976.0%
 
645315.9%
 
742525.5%
 
940735.3%
 
840595.3%
 
n11401.5%
 
a5700.7%
 

PARENT1
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
Value   Count   Frequency (%)
No      8959    87.0%
Yes     1343    13.0%
 

HOME_VAL
Categorical

HIGH CARDINALITY
MISSING

Distinct count6334
Unique (%)65.1%
Missing575
Missing (%)5.6%
Memory size80.6 KiB
ValueCountFrequency (%) 
$0 290828.2%
 
$227,138 3< 0.1%
 
$99,103 3< 0.1%
 
$205,130 3< 0.1%
 
$339,052 3< 0.1%
 
$123,109 3< 0.1%
 
$167,505 3< 0.1%
 
$173,130 3< 0.1%
 
$513,817 3< 0.1%
 
$196,320 3< 0.1%
 
$117,038 3< 0.1%
 
$238,724 3< 0.1%
 
$121,949 3< 0.1%
 
$189,439 3< 0.1%
 
$288,592 3< 0.1%
 
$166,481 3< 0.1%
 
$165,641 3< 0.1%
 
$159,568 3< 0.1%
 
$178,852 3< 0.1%
 
$183,581 3< 0.1%
 
$222,957 3< 0.1%
 
$332,673 3< 0.1%
 
$115,249 3< 0.1%
 
$245,878 3< 0.1%
 
$154,672 3< 0.1%
 
Other values (6309)674765.5%
 
(Missing)5755.6%
 

Length

Max length9
Median length9
Mean length6.92603378
Min length3

Overview of Unicode Properties

Unique unicode characters15
Unique unicode categories (?)5
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
$972713.6%
 
972713.6%
 
,68199.6%
 
063618.9%
 
161418.6%
 
258728.2%
 
341935.9%
 
435915.0%
 
534964.9%
 
834474.8%
 
634324.8%
 
934264.8%
 
733954.8%
 
n11501.6%
 
a5750.8%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number4335460.8%
 
Currency Symbol972713.6%
 
Space Separator972713.6%
 
Other Punctuation68199.6%
 
Lowercase Letter17252.4%
 

Most frequent Currency Symbol characters

ValueCountFrequency (%) 
$9727100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0636114.7%
 
1614114.2%
 
2587213.5%
 
341939.7%
 
435918.3%
 
534968.1%
 
834478.0%
 
634327.9%
 
934267.9%
 
733957.8%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
9727100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
,6819100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n115066.7%
 
a57533.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common6962797.6%
 
Latin17252.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
$972714.0%
 
972714.0%
 
,68199.8%
 
063619.1%
 
161418.8%
 
258728.4%
 
341936.0%
 
435915.2%
 
534965.0%
 
834475.0%
 
634324.9%
 
934264.9%
 
733954.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n115066.7%
 
a57533.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII71352100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
$972713.6%
 
972713.6%
 
,68199.6%
 
063618.9%
 
161418.6%
 
258728.2%
 
341935.9%
 
435915.0%
 
534964.9%
 
834474.8%
 
634324.8%
 
934264.8%
 
733954.8%
 
n11501.6%
 
a5750.8%
 

MSTATUS
Categorical

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
Value   Count   Frequency (%)
Yes     6188    60.1%
z_No    4114    39.9%
 

Length

Max length4
Median length3
Mean length3.399339934
Min length3

Overview of Unicode Properties

Unique unicode characters7
Unique unicode categories (?)3
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
Y618817.7%
 
e618817.7%
 
s618817.7%
 
z411411.7%
 
_411411.7%
 
N411411.7%
 
o411411.7%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter2060458.8%
 
Uppercase Letter1030229.4%
 
Connector Punctuation411411.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e618830.0%
 
s618830.0%
 
z411420.0%
 
o411420.0%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_4114100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
Y618860.1%
 
N411439.9%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin3090688.3%
 
Common411411.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
Y618820.0%
 
e618820.0%
 
s618820.0%
 
z411413.3%
 
N411413.3%
 
o411413.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
_4114100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII35020100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
Y618817.7%
 
e618817.7%
 
s618817.7%
 
z411411.7%
 
_411411.7%
 
N411411.7%
 
o411411.7%
 

GENDER
Categorical

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
Value   Count   Frequency (%)
z_F     5545    53.8%
M       4757    46.2%
 

Length

Max length3
Median length3
Mean length2.076490002
Min length1

Overview of Unicode Properties

Unique unicode characters4
Unique unicode categories (?)3
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
z554525.9%
 
_554525.9%
 
F554525.9%
 
M475722.2%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter1030248.2%
 
Lowercase Letter554525.9%
 
Connector Punctuation554525.9%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
F554553.8%
 
M475746.2%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
z5545100.0%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_5545100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin1584774.1%
 
Common554525.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
z554535.0%
 
F554535.0%
 
M475730.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
_5545100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII21392100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
z554525.9%
 
_554525.9%
 
F554525.9%
 
M475722.2%
 

EDUCATION
Categorical

Distinct count5
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
Value           Count   Frequency (%)
z_High School   2952    28.7%
Bachelors       2823    27.4%
Masters         2078    20.2%
<High School    1515    14.7%
PhD             934     9.1%
 

Length

Max length13
Median length9
Mean length9.639972821
Min length3

Overview of Unicode Properties

Unique unicode characters21
Unique unicode categories (?)5
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
h1269112.8%
 
o1175711.8%
 
c72907.3%
 
l72907.3%
 
s69797.0%
 
a49014.9%
 
e49014.9%
 
r49014.9%
 
H44674.5%
 
i44674.5%
 
g44674.5%
 
44674.5%
 
S44674.5%
 
z29523.0%
 
_29523.0%
 
B28232.8%
 
M20782.1%
 
t20782.1%
 
<15151.5%
 
P9340.9%
 
D9340.9%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter7467475.2%
 
Uppercase Letter1570315.8%
 
Space Separator44674.5%
 
Connector Punctuation29523.0%
 
Math Symbol15151.5%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
H446728.4%
 
S446728.4%
 
B282318.0%
 
M207813.2%
 
P9345.9%
 
D9345.9%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
h1269117.0%
 
o1175715.7%
 
c72909.8%
 
l72909.8%
 
s69799.3%
 
a49016.6%
 
e49016.6%
 
r49016.6%
 
i44676.0%
 
g44676.0%
 
z29524.0%
 
t20782.8%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_2952100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
4467100.0%
 

Most frequent Math Symbol characters

ValueCountFrequency (%) 
<1515100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin9037791.0%
 
Common89349.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
h1269114.0%
 
o1175713.0%
 
c72908.1%
 
l72908.1%
 
s69797.7%
 
a49015.4%
 
e49015.4%
 
r49015.4%
 
H44674.9%
 
i44674.9%
 
g44674.9%
 
S44674.9%
 
z29523.3%
 
B28233.1%
 
M20782.3%
 
t20782.3%
 
P9341.0%
 
D9341.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
446750.0%
 
_295233.0%
 
<151517.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII99311100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
h1269112.8%
 
o1175711.8%
 
c72907.3%
 
l72907.3%
 
s69797.0%
 
a49014.9%
 
e49014.9%
 
r49014.9%
 
H44674.5%
 
i44674.5%
 
g44674.5%
 
44674.5%
 
S44674.5%
 
z29523.0%
 
_29523.0%
 
B28232.8%
 
M20782.1%
 
t20782.1%
 
<15151.5%
 
P9340.9%
 
D9340.9%
 

OCCUPATION
Categorical

MISSING

Distinct count8
Unique (%)0.1%
Missing665
Missing (%)6.5%
Memory size80.6 KiB
Value           Count   Frequency (%)
z_Blue Collar   2288    22.2%
Clerical        1590    15.4%
Professional    1408    13.7%
Manager         1257    12.2%
Lawyer          1031    10.0%
Student         899     8.7%
Home Maker      843     8.2%
Doctor          321     3.1%
(Missing)       665     6.5%
 

Length

Max length13
Median length8
Mean length9.026305572
Min length3

Overview of Unicode Properties

Unique unicode characters29
Unique unicode categories (?)4
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
l1145212.3%
 
a1033911.1%
 
e1015910.9%
 
r87389.4%
 
o65897.1%
 
n48945.3%
 
C38784.2%
 
u31873.4%
 
31313.4%
 
i29983.2%
 
s28163.0%
 
z22882.5%
 
_22882.5%
 
B22882.5%
 
t21192.3%
 
M21002.3%
 
c19112.1%
 
P14081.5%
 
f14081.5%
 
g12571.4%
 
L10311.1%
 
w10311.1%
 
y10311.1%
 
S8991.0%
 
d8991.0%
 
Other values (4)28503.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter7480280.4%
 
Uppercase Letter1276813.7%
 
Space Separator31313.4%
 
Connector Punctuation22882.5%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
C387830.4%
 
B228817.9%
 
M210016.4%
 
P140811.0%
 
L10318.1%
 
S8997.0%
 
H8436.6%
 
D3212.5%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
l1145215.3%
 
a1033913.8%
 
e1015913.6%
 
r873811.7%
 
o65898.8%
 
n48946.5%
 
u31874.3%
 
i29984.0%
 
s28163.8%
 
z22883.1%
 
t21192.8%
 
c19112.6%
 
f14081.9%
 
g12571.7%
 
w10311.4%
 
y10311.4%
 
d8991.2%
 
m8431.1%
 
k8431.1%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_2288100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
3131100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin8757094.2%
 
Common54195.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
l1145213.1%
 
a1033911.8%
 
e1015911.6%
 
r873810.0%
 
o65897.5%
 
n48945.6%
 
C38784.4%
 
u31873.6%
 
i29983.4%
 
s28163.2%
 
z22882.6%
 
B22882.6%
 
t21192.4%
 
M21002.4%
 
c19112.2%
 
P14081.6%
 
f14081.6%
 
g12571.4%
 
L10311.2%
 
w10311.2%
 
y10311.2%
 
S8991.0%
 
d8991.0%
 
H8431.0%
 
m8431.0%
 
Other values (2)11641.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
313157.8%
 
_228842.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII92989100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
l1145212.3%
 
a1033911.1%
 
e1015910.9%
 
r87389.4%
 
o65897.1%
 
n48945.3%
 
C38784.2%
 
u31873.4%
 
31313.4%
 
i29983.2%
 
s28163.0%
 
z22882.5%
 
_22882.5%
 
B22882.5%
 
t21192.3%
 
M21002.3%
 
c19112.1%
 
P14081.5%
 
f14081.5%
 
g12571.4%
 
L10311.1%
 
w10311.1%
 
y10311.1%
 
S8991.0%
 
d8991.0%
 
Other values (4)28503.1%
 

TRAVTIME
Real number (ℝ≥0)

Distinct count100
Unique (%)1.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean33.41642399534071
Minimum5
Maximum142
Zeros0
Zeros (%)0.0%
Memory size80.6 KiB

Quantile statistics

Minimum5
5-th percentile7
Q122
median33
Q344
95-th percentile60
Maximum142
Range137
Interquartile range (IQR)22

Descriptive statistics

Standard deviation15.86968685
Coefficient of variation (CV)0.4749067958
Kurtosis0.5946562503
Mean33.416424
Median Absolute Deviation (MAD)11
Skewness0.4355271578
Sum344256
Variance251.8469606
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
54274.1%
 
322882.8%
 
352712.6%
 
332682.6%
 
362662.6%
 
302652.6%
 
372642.6%
 
292592.5%
 
252572.5%
 
242532.5%
 
342472.4%
 
262432.4%
 
402422.3%
 
312392.3%
 
382362.3%
 
282312.2%
 
272252.2%
 
392252.2%
 
412212.1%
 
432132.1%
 
212082.0%
 
222052.0%
 
232052.0%
 
451951.9%
 
441931.9%
 
Other values (75)415640.3%
 
ValueCountFrequency (%) 
54274.1%
 
6660.6%
 
7560.5%
 
8690.7%
 
9860.8%
 
101011.0%
 
11900.9%
 
121221.2%
 
131211.2%
 
141331.3%
 
ValueCountFrequency (%) 
1421< 0.1%
 
1341< 0.1%
 
1241< 0.1%
 
1131< 0.1%
 
1051< 0.1%
 
1031< 0.1%
 
1011< 0.1%
 
991< 0.1%
 
981< 0.1%
 
972< 0.1%
 

CAR_USE
Categorical

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
Value        Count   Frequency (%)
Private      6513    63.2%
Commercial   3789    36.8%
 

Length

Max length10
Median length7
Mean length8.103377985
Min length7

Overview of Unicode Properties

Unique unicode characters12
Unique unicode categories (?)2
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
r1030212.3%
 
i1030212.3%
 
a1030212.3%
 
e1030212.3%
 
m75789.1%
 
P65137.8%
 
v65137.8%
 
t65137.8%
 
C37894.5%
 
o37894.5%
 
c37894.5%
 
l37894.5%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter7317987.7%
 
Uppercase Letter1030212.3%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
P651363.2%
 
C378936.8%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
r1030214.1%
 
i1030214.1%
 
a1030214.1%
 
e1030214.1%
 
m757810.4%
 
v65138.9%
 
t65138.9%
 
o37895.2%
 
c37895.2%
 
l37895.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin83481100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
r1030212.3%
 
i1030212.3%
 
a1030212.3%
 
e1030212.3%
 
m75789.1%
 
P65137.8%
 
v65137.8%
 
t65137.8%
 
C37894.5%
 
o37894.5%
 
c37894.5%
 
l37894.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII83481100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
r1030212.3%
 
i1030212.3%
 
a1030212.3%
 
e1030212.3%
 
m75789.1%
 
P65137.8%
 
v65137.8%
 
t65137.8%
 
C37894.5%
 
o37894.5%
 
c37894.5%
 
l37894.5%
 

BLUEBOOK
Categorical

HIGH CARDINALITY

Distinct count2985
Unique (%)29.0%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
ValueCountFrequency (%) 
$1,500 2072.0%
 
$6,200 470.5%
 
$6,000 420.4%
 
$5,800 390.4%
 
$5,400 380.4%
 
$5,600 380.4%
 
$5,900 370.4%
 
$6,500 360.3%
 
$5,700 360.3%
 
$6,100 350.3%
 
$6,400 350.3%
 
$6,800 310.3%
 
$6,600 300.3%
 
$6,300 290.3%
 
$5,300 280.3%
 
$5,500 270.3%
 
$5,100 240.2%
 
$5,000 220.2%
 
$7,200 220.2%
 
$5,200 220.2%
 
$7,000 210.2%
 
$7,100 210.2%
 
$6,700 200.2%
 
$6,900 190.2%
 
$4,900 180.2%
 
Other values (2960)937891.0%
 

Length

Max length8
Median length8
Mean length7.712871287
Min length7

Overview of Unicode Properties

Unique unicode characters13
Unique unicode categories (?)4
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
01415217.8%
 
$1030213.0%
 
,1030213.0%
 
1030213.0%
 
175729.5%
 
250576.4%
 
334194.3%
 
533234.2%
 
631484.0%
 
430593.8%
 
730473.8%
 
829323.7%
 
928433.6%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number4855261.1%
 
Currency Symbol1030213.0%
 
Other Punctuation1030213.0%
 
Space Separator1030213.0%
 

Most frequent Currency Symbol characters

ValueCountFrequency (%) 
$10302100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
01415229.1%
 
1757215.6%
 
2505710.4%
 
334197.0%
 
533236.8%
 
631486.5%
 
430596.3%
 
730476.3%
 
829326.0%
 
928435.9%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
,10302100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
10302100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common79458100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
01415217.8%
 
$1030213.0%
 
,1030213.0%
 
1030213.0%
 
175729.5%
 
250576.4%
 
334194.3%
 
533234.2%
 
631484.0%
 
430593.8%
 
730473.8%
 
829323.7%
 
928433.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII79458100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
01415217.8%
 
$1030213.0%
 
,1030213.0%
 
1030213.0%
 
175729.5%
 
250576.4%
 
334194.3%
 
533234.2%
 
631484.0%
 
430593.8%
 
730473.8%
 
829323.7%
 
928433.6%
 

TIF
Real number (ℝ≥0)

Distinct count23
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.329159386526888
Minimum1
Maximum25
Zeros0
Zeros (%)0.0%
Memory size80.6 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median4
Q37
95-th percentile13
Maximum25
Range24
Interquartile range (IQR)6

Descriptive statistics

Standard deviation4.110794715
Coefficient of variation (CV)0.7713777009
Kurtosis0.4797085698
Mean5.329159387
Median Absolute Deviation (MAD)3
Skewness0.8994152554
Sum54901
Variance16.89863319
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
1317230.8%
 
6170716.6%
 
4161615.7%
 
109519.2%
 
77817.6%
 
35315.2%
 
133553.4%
 
113002.9%
 
92992.9%
 
171261.2%
 
14920.9%
 
8830.8%
 
5700.7%
 
12550.5%
 
16500.5%
 
15400.4%
 
18260.3%
 
21130.1%
 
20120.1%
 
19110.1%
 
260.1%
 
253< 0.1%
 
223< 0.1%
 
ValueCountFrequency (%) 
1317230.8%
 
260.1%
 
35315.2%
 
4161615.7%
 
5700.7%
 
6170716.6%
 
77817.6%
 
8830.8%
 
92992.9%
 
109519.2%
 
ValueCountFrequency (%) 
253< 0.1%
 
223< 0.1%
 
21130.1%
 
20120.1%
 
19110.1%
 
18260.3%
 
171261.2%
 
16500.5%
 
15400.4%
 
14920.9%
 

CAR_TYPE
Categorical

Distinct count6
Unique (%)0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
Value         Count   Frequency (%)
z_SUV         2883    28.0%
Minivan       2694    26.2%
Pickup        1772    17.2%
Sports Car    1179    11.4%
Van           921     8.9%
Panel Truck   853     8.3%
 

Length

Max length11
Median length6
Mean length6.58522617
Min length3

Overview of Unicode Properties

Unique unicode characters24
Unique unicode categories (?)4
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n716210.6%
 
i716010.6%
 
a56478.3%
 
S40626.0%
 
V38045.6%
 
r32114.7%
 
p29514.3%
 
z28834.2%
 
_28834.2%
 
U28834.2%
 
M26944.0%
 
v26944.0%
 
P26253.9%
 
u26253.9%
 
c26253.9%
 
k26253.9%
 
20323.0%
 
o11791.7%
 
t11791.7%
 
s11791.7%
 
C11791.7%
 
e8531.3%
 
l8531.3%
 
T8531.3%
 

Most occurring categories

Value                  Count  Frequency (%)
Lowercase Letter       44826  66.1%
Uppercase Letter       18100  26.7%
Connector Punctuation  2883   4.2%
Space Separator        2032   3.0%

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S406222.4%
 
V380421.0%
 
U288315.9%
 
M269414.9%
 
P262514.5%
 
C11796.5%
 
T8534.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n716216.0%
 
i716016.0%
 
a564712.6%
 
r32117.2%
 
p29516.6%
 
z28836.4%
 
v26946.0%
 
u26255.9%
 
c26255.9%
 
k26255.9%
 
o11792.6%
 
t11792.6%
 
s11792.6%
 
e8531.9%
 
l8531.9%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_2883100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
2032100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin6292692.8%
 
Common49157.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n716211.4%
 
i716011.4%
 
a56479.0%
 
S40626.5%
 
V38046.0%
 
r32115.1%
 
p29514.7%
 
z28834.6%
 
U28834.6%
 
M26944.3%
 
v26944.3%
 
P26254.2%
 
u26254.2%
 
c26254.2%
 
k26254.2%
 
o11791.9%
 
t11791.9%
 
s11791.9%
 
C11791.9%
 
e8531.4%
 
l8531.4%
 
T8531.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
_288358.7%
 
203241.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII67841100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n716210.6%
 
i716010.6%
 
a56478.3%
 
S40626.0%
 
V38045.6%
 
r32114.7%
 
p29514.3%
 
z28834.2%
 
_28834.2%
 
U28834.2%
 
M26944.0%
 
v26944.0%
 
P26253.9%
 
u26253.9%
 
c26253.9%
 
k26253.9%
 
20323.0%
 
o11791.7%
 
t11791.7%
 
s11791.7%
 
C11791.7%
 
e8531.3%
 
l8531.3%
 
T8531.3%
 

RED_CAR
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB

Value  Count  Frequency (%)
no     7326   71.1%
yes    2976   28.9%

OLDCLAIM
Categorical

HIGH CARDINALITY

Distinct count: 3545
Unique (%): 34.4%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB

Value                Count  Frequency (%)
$0                   6292   61.1%
$1,391               4      < 0.1%
$1,310               4      < 0.1%
$4,188               4      < 0.1%
$4,263               4      < 0.1%
$1,105               4      < 0.1%
$4,538               4      < 0.1%
$4,448               4      < 0.1%
$4,824               3      < 0.1%
$4,423               3      < 0.1%
$2,740               3      < 0.1%
$6,985               3      < 0.1%
$4,451               3      < 0.1%
$5,289               3      < 0.1%
$3,460               3      < 0.1%
$6,935               3      < 0.1%
$4,528               3      < 0.1%
$5,863               3      < 0.1%
$6,281               3      < 0.1%
$3,863               3      < 0.1%
$1,332               3      < 0.1%
$4,582               3      < 0.1%
$5,101               3      < 0.1%
$8,174               3      < 0.1%
$5,399               3      < 0.1%
Other values (3520)  3931   38.2%

Length

Max length: 8
Median length: 3
Mean length: 4.625703747
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 13
Unique unicode categories: 4
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
$1030221.6%
 
1030221.6%
 
0763616.0%
 
,38828.1%
 
320124.2%
 
119634.1%
 
418153.8%
 
517693.7%
 
217633.7%
 
616263.4%
 
815523.3%
 
715463.2%
 
914863.1%
 

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     23168  48.6%
Currency Symbol    10302  21.6%
Space Separator    10302  21.6%
Other Punctuation  3882   8.1%

Most frequent Currency Symbol characters

ValueCountFrequency (%) 
$10302100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0763633.0%
 
320128.7%
 
119638.5%
 
418157.8%
 
517697.6%
 
217637.6%
 
616267.0%
 
815526.7%
 
715466.7%
 
914866.4%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
,3882100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
10302100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common47654100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
$1030221.6%
 
1030221.6%
 
0763616.0%
 
,38828.1%
 
320124.2%
 
119634.1%
 
418153.8%
 
517693.7%
 
217633.7%
 
616263.4%
 
815523.3%
 
715463.2%
 
914863.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII47654100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
$1030221.6%
 
1030221.6%
 
0763616.0%
 
,38828.1%
 
320124.2%
 
119634.1%
 
418153.8%
 
517693.7%
 
217633.7%
 
616263.4%
 
815523.3%
 
715463.2%
 
914863.1%
 

CLM_FREQ
Real number (ℝ≥0)

ZEROS

Distinct count: 6
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8007183071248302
Minimum: 0
Maximum: 5
Zeros: 6292
Zeros (%): 61.1%
Memory size: 80.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 2
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.154078575
Coefficient of variation (CV): 1.441304095
Kurtosis: 0.2459181422
Mean: 0.8007183071
Median Absolute Deviation (MAD): 0
Skewness: 1.194062397
Sum: 8249
Variance: 1.331897358
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
0      6292   61.1%
2      1492   14.5%
1      1279   12.4%
3      992    9.6%
4      225    2.2%
5      22     0.2%

Value  Count  Frequency (%)
0      6292   61.1%
1      1279   12.4%
2      1492   14.5%
3      992    9.6%
4      225    2.2%
5      22     0.2%

Value  Count  Frequency (%)
5      22     0.2%
4      225    2.2%
3      992    9.6%
2      1492   14.5%
1      1279   12.4%
0      6292   61.1%

REVOKED
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB

Value  Count  Frequency (%)
No     9041   87.8%
Yes    1261   12.2%

MVR_PTS
Real number (ℝ≥0)

ZEROS

Distinct count: 14
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7101533682780043
Minimum: 0
Maximum: 13
Zeros: 4658
Zeros (%): 45.2%
Memory size: 80.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 3
95-th percentile: 6
Maximum: 13
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.159014892
Coefficient of variation (CV): 1.262468578
Kurtosis: 1.335837085
Mean: 1.710153368
Median Absolute Deviation (MAD): 1
Skewness: 1.34050631
Sum: 17618
Variance: 4.661345302
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
0      4658   45.2%
1      1467   14.2%
2      1199   11.6%
3      966    9.4%
4      727    7.1%
5      528    5.1%
6      341    3.3%
7      213    2.1%
8      114    1.1%
9      53     0.5%
10     20     0.2%
11     13     0.1%
13     2      < 0.1%
12     1      < 0.1%

Value  Count  Frequency (%)
0      4658   45.2%
1      1467   14.2%
2      1199   11.6%
3      966    9.4%
4      727    7.1%
5      528    5.1%
6      341    3.3%
7      213    2.1%
8      114    1.1%
9      53     0.5%

Value  Count  Frequency (%)
13     2      < 0.1%
12     1      < 0.1%
11     13     0.1%
10     20     0.2%
9      53     0.5%
8      114    1.1%
7      213    2.1%
6      341    3.3%
5      528    5.1%
4      727    7.1%

CLM_AMT
Categorical

HIGH CARDINALITY

Distinct count: 2346
Unique (%): 22.8%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB

Value                Count  Frequency (%)
$0                   7556   73.3%
$3,674               4      < 0.1%
$3,667               4      < 0.1%
$4,363               4      < 0.1%
$2,327               4      < 0.1%
$3,350               4      < 0.1%
$2,979               3      < 0.1%
$2,027               3      < 0.1%
$2,641               3      < 0.1%
$3,278               3      < 0.1%
$7,893               3      < 0.1%
$6,409               3      < 0.1%
$5,692               3      < 0.1%
$2,729               3      < 0.1%
$2,489               3      < 0.1%
$2,668               3      < 0.1%
$3,008               3      < 0.1%
$3,776               3      < 0.1%
$1,493               3      < 0.1%
$5,951               3      < 0.1%
$1,842               3      < 0.1%
$4,566               3      < 0.1%
$5,900               3      < 0.1%
$2,493               3      < 0.1%
$5,453               3      < 0.1%
Other values (2321)  2669   25.9%

Length

Max length: 9
Median length: 3
Mean length: 4.060667831
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 13
Unique unicode categories: 4
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
$1030224.6%
 
1030224.6%
 
0843320.2%
 
,26206.3%
 
313783.3%
 
412973.1%
 
212813.1%
 
112202.9%
 
512052.9%
 
610392.5%
 
79462.3%
 
89232.2%
 
98872.1%
 

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     18609  44.5%
Currency Symbol    10302  24.6%
Space Separator    10302  24.6%
Other Punctuation  2620   6.3%

Most frequent Currency Symbol characters

ValueCountFrequency (%) 
$10302100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0843345.3%
 
313787.4%
 
412977.0%
 
212816.9%
 
112206.6%
 
512056.5%
 
610395.6%
 
79465.1%
 
89235.0%
 
98874.8%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
10302100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
,2620100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common41833100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
$1030224.6%
 
1030224.6%
 
0843320.2%
 
,26206.3%
 
313783.3%
 
412973.1%
 
212813.1%
 
112202.9%
 
512052.9%
 
610392.5%
 
79462.3%
 
89232.2%
 
98872.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII41833100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
$1030224.6%
 
1030224.6%
 
0843320.2%
 
,26206.3%
 
313783.3%
 
412973.1%
 
212813.1%
 
112202.9%
 
512052.9%
 
610392.5%
 
79462.3%
 
89232.2%
 
98872.1%
 

CAR_AGE
Real number (ℝ)

MISSING

Distinct count: 30
Unique (%): 0.3%
Missing: 639
Missing (%): 6.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.298147573217427
Minimum: -3.0
Maximum: 28.0
Zeros: 4
Zeros (%): < 0.1%
Memory size: 80.6 KiB

Quantile statistics

Minimum: -3
5-th percentile: 1
Q1: 1
Median: 8
Q3: 12
95-th percentile: 18
Maximum: 28
Range: 31
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 5.714450164
Coefficient of variation (CV): 0.6886416654
Kurtosis: -0.764329956
Mean: 8.298147573
Median Absolute Deviation (MAD): 5
Skewness: 0.280460533
Sum: 80185
Variance: 32.65494068
Histogram with fixed size bins (bins=10)
Value             Count  Frequency (%)
1                 2489   24.2%
8                 696    6.8%
9                 659    6.4%
7                 655    6.4%
10                600    5.8%
6                 552    5.4%
11                549    5.3%
12                467    4.5%
13                450    4.4%
14                396    3.8%
15                370    3.6%
5                 360    3.5%
16                289    2.8%
17                265    2.6%
18                191    1.9%
4                 169    1.6%
19                155    1.5%
20                112    1.1%
3                 70     0.7%
21                65     0.6%
22                33     0.3%
23                22     0.2%
2                 18     0.2%
24                13     0.1%
25                8      0.1%
Other values (5)  10     0.1%
(Missing)         639    6.2%

Value  Count  Frequency (%)
-3     1      < 0.1%
0      4      < 0.1%
1      2489   24.2%
2      18     0.2%
3      70     0.7%
4      169    1.6%
5      360    3.5%
6      552    5.4%
7      655    6.4%
8      696    6.8%

Value  Count  Frequency (%)
28     1      < 0.1%
27     1      < 0.1%
26     3      < 0.1%
25     8      0.1%
24     13     0.1%
23     22     0.2%
22     33     0.3%
21     65     0.6%
20     112    1.1%
19     155    1.5%

CLAIM_FLAG
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB

Value  Count  Frequency (%)
0      7556   73.3%
1      2746   26.7%

URBANICITY
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB

Value                  Count  Frequency (%)
Highly Urban/ Urban    8230   79.9%
z_Highly Rural/ Rural  2072   20.1%

Length

Max length: 21
Median length: 19
Mean length: 19.40225199
Min length: 19

Overview of Unicode Properties

Unique unicode characters: 17
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
2060410.3%
 
r2060410.3%
 
a2060410.3%
 
U164608.2%
 
b164608.2%
 
n164608.2%
 
l144467.2%
 
H103025.2%
 
i103025.2%
 
g103025.2%
 
h103025.2%
 
y103025.2%
 
/103025.2%
 
R41442.1%
 
u41442.1%
 
z20721.0%
 
_20721.0%
 

Most occurring categories

Value                  Count   Frequency (%)
Lowercase Letter       135998  68.0%
Uppercase Letter       30906   15.5%
Space Separator        20604   10.3%
Other Punctuation      10302   5.2%
Connector Punctuation  2072    1.0%

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
U1646053.3%
 
H1030233.3%
 
R414413.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
r2060415.2%
 
a2060415.2%
 
b1646012.1%
 
n1646012.1%
 
l1444610.6%
 
i103027.6%
 
g103027.6%
 
h103027.6%
 
y103027.6%
 
u41443.0%
 
z20721.5%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
20604100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/10302100.0%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_2072100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin16690483.5%
 
Common3297816.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
r2060412.3%
 
a2060412.3%
 
U164609.9%
 
b164609.9%
 
n164609.9%
 
l144468.7%
 
H103026.2%
 
i103026.2%
 
g103026.2%
 
h103026.2%
 
y103026.2%
 
R41442.5%
 
u41442.5%
 
z20721.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
2060462.5%
 
/1030231.2%
 
_20726.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII199882100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
2060410.3%
 
r2060410.3%
 
a2060410.3%
 
U164608.2%
 
b164608.2%
 
n164608.2%
 
l144467.2%
 
H103025.2%
 
i103025.2%
 
g103025.2%
 
h103025.2%
 
y103025.2%
 
/103025.2%
 
R41442.1%
 
u41442.1%
 
z20721.0%
 
_20721.0%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
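
The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` and the sample data are illustrative assumptions, not taken from the report's implementation. The 1/n factors in the covariance and in the standard deviations cancel, so the sketch works with raw sums of deviations.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of (cross-)deviations; the common 1/n factor cancels in the ratio.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Changing location and scale of each variable separately leaves r unchanged.
r_shifted = pearson_r([10 * v + 7 for v in x], [0.5 * v - 3 for v in y])
print(r, r_shifted)
```

Both calls give the same coefficient up to floating-point rounding, which illustrates the invariance under separate changes of location and (positive) scale.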

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and 1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
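
A minimal sketch of this procedure, assuming ties receive the average of the ranks they span (a common convention; the report's own tie handling is not stated here). The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson correlation from sums of deviations (the 1/n factors cancel)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Pearson correlation applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x
print(spearman_rho(x, y), pearson_r(x, y))
```

For the cubic relationship in the example, ρ reaches 1 because the ordering is preserved exactly, while Pearson's r stays below 1 because the relationship is not linear.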

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and 1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
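
The pair-counting definition above can be sketched directly. This is the simple variant without a tie correction (often called tau-a); statistical libraries frequently default to a tie-corrected variant, so results may differ when the data contain ties. The function name `kendall_tau` is an illustrative assumption.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables order the pair the same way
        elif sign < 0:
            discordant += 1  # the variables order the pair in opposite ways
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant, one discordant (the swapped 2 and 3).
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

This O(n²) loop is fine for illustration; for a dataset of this report's size a library routine would be the more practical choice.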

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
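
A sketch of the bias-corrected estimator as it is commonly stated, computed from a contingency table of counts. The function name `cramers_v_corrected` and the example tables are hypothetical, and the code has not been checked against the report's own implementation.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


perfect = cramers_v_corrected([[50, 0], [0, 50]])       # categories determine each other
independent = cramers_v_corrected([[25, 25], [25, 25]])  # no association
print(perfect, independent)
```

In these two hypothetical tables the estimator gives approximately 1 for the perfectly associated case and 0 for the independent case, matching the range described above.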

Missing values

Sample

First rows

IDKIDSDRIVBIRTHAGEHOMEKIDSYOJINCOMEPARENT1HOME_VALMSTATUSGENDEREDUCATIONOCCUPATIONTRAVTIMECAR_USEBLUEBOOKTIFCAR_TYPERED_CAROLDCLAIMCLM_FREQREVOKEDMVR_PTSCLM_AMTCAR_AGECLAIM_FLAGURBANICITY
063581743016-Mar-3960.0011.0$67,349No$0z_NoMPhDProfessional14Private$14,23011Minivanyes$4,4612No3$018.00Highly Urban/ Urban
1132761049021-Jan-5643.0011.0$91,449No$257,252z_NoMz_High Schoolz_Blue Collar22Commercial$14,9401Minivanyes$00No0$01.00Highly Urban/ Urban
2921317019018-Nov-5148.0011.0$52,881No$0z_NoMBachelorsManager26Private$21,9701Vanyes$00No2$010.00Highly Urban/ Urban
372759847305-Mar-6435.0110.0$16,039No$124,191Yesz_Fz_High SchoolClerical5Private$4,0104z_SUVno$38,6902No3$010.00Highly Urban/ Urban
445022186105-Jun-4851.0014.0NaNNo$306,251YesM<High Schoolz_Blue Collar32Private$15,4407Minivanyes$00No0$06.00Highly Urban/ Urban
5743146596017-May-4950.00NaN$114,986No$243,925Yesz_FPhDDoctor36Private$18,0001z_SUVno$19,2172Yes3$017.00Highly Urban/ Urban
687102463105-May-6534.0112.0$125,301Yes$0z_Noz_FBachelorsz_Blue Collar46Commercial$17,4301Sports Carno$00No0$2,9467.01Highly Urban/ Urban
7792300541028-Feb-4554.00NaN$18,755NoNaNYesz_F<High Schoolz_Blue Collar33Private$8,7801z_SUVno$00No0$01.00Highly Urban/ Urban
87945239117-Sep-5940.0111.0$50,815Yes$0z_NoMz_High SchoolManager21Private$18,9306Minivanno$3,2951No2$6,4771.01Highly Urban/ Urban
93577610021-Aug-5544.0212.0$43,486Yes$0z_Noz_Fz_High Schoolz_Blue Collar30Commercial$5,90010z_SUVno$00No0$010.00z_Highly Rural/ Rural

Last rows

IDKIDSDRIVBIRTHAGEHOMEKIDSYOJINCOMEPARENT1HOME_VALMSTATUSGENDEREDUCATIONOCCUPATIONTRAVTIMECAR_USEBLUEBOOKTIFCAR_TYPERED_CAROLDCLAIMCLM_FREQREVOKEDMVR_PTSCLM_AMTCAR_AGECLAIM_FLAGURBANICITY
10292452807843019-Dec-5048.0010.0$111,305No$0z_Noz_FPhDDoctor59Private$17,43013z_SUVno$00No4$018.00Highly Urban/ Urban
10293814422920017-Aug-4851.0010.0$128,523No$0z_NoMMastersNaN18Commercial$32,9606Panel Truckno$3,9953No1$3,28815.01Highly Urban/ Urban
10294721196389118-Oct-6138.0416.0$12,717No$0Yesz_FBachelorsStudent15Commercial$24,7401Pickupno$9,2453No3$015.00Highly Urban/ Urban
10295215633551015-May-5841.007.0$6,256No$0z_NoMz_High SchoolStudent41Private$5,6001Pickupno$00No0$07.00z_Highly Rural/ Rural
1029612144157801-Jul-6435.0011.0$43,112No$0z_NoMz_High Schoolz_Blue Collar51Commercial$27,33010Panel Truckyes$00No0$08.00z_Highly Rural/ Rural
1029767790126113-Aug-5445.029.0$164,669No$386,273YesMPhDManager21Private$13,27015Minivanno$00No2$017.00Highly Urban/ Urban
1029861970712017-Jun-5346.009.0$107,204No$332,591YesMMastersNaN36Commercial$24,4906Panel Truckno$00No0$01.00Highly Urban/ Urban
10299849208064018-Jun-5148.0015.0$39,837No$170,611Yesz_F<High Schoolz_Blue Collar12Private$13,8207z_SUVno$00No0$01.00Highly Urban/ Urban
10300627828331012-Dec-4850.007.0$43,445No$149,248Yesz_FBachelorsHome Maker36Private$22,5506Minivanno$00No0$011.00Highly Urban/ Urban
10301680381960027-Feb-4752.0011.0$53,235No$197,017Yesz_Fz_High SchoolClerical64Private$19,4006Minivanno$00No0$09.00z_Highly Rural/ Rural

Duplicate rows

Most frequent

IDKIDSDRIVBIRTHAGEHOMEKIDSYOJINCOMEPARENT1HOME_VALMSTATUSGENDEREDUCATIONOCCUPATIONTRAVTIMECAR_USEBLUEBOOKTIFCAR_TYPERED_CAROLDCLAIMCLM_FREQREVOKEDMVR_PTSCLM_AMTCAR_AGECLAIM_FLAGURBANICITYcount
0279799481021-Feb-6039.0014.0$93,077No$244,764YesMBachelorsProfessional29Private$14,7101Minivanyes$00No0$01.00z_Highly Rural/ Rural2